Added support to rotate in fp32 (optional) #885
Conversation
📝 Walkthrough

This change introduces granular rotation configuration for the quantization preprocessing pipeline. The `rotate` parameter now accepts either a boolean or a dictionary with `"enable"` and `"rotate_fp32"` keys, enabling independent control of rotation functionality and floating-point precision casting during Hadamard transforms.
Codecov Report

❌ Patch coverage is
Additional details and impacted files

@@ Coverage Diff @@
## main #885 +/- ##
==========================================
- Coverage 73.11% 73.09% -0.02%
==========================================
Files 205 205
Lines 22281 22292 +11
==========================================
+ Hits 16291 16295 +4
- Misses 5990 5997 +7

☔ View full report in Codecov by Sentry.
Force-pushed 57f1766 to 044a4ab
modelopt-bot
left a comment
Code Review: PR #885 - Added support to rotate in fp32 (optional)
Overview
This PR adds an optional FP32 precision mode for the Hadamard rotation transform used in RHT (Randomized Hadamard Transform) quantization methods like QuaRot/SpinQuant. The change is backward compatible.
✅ Strengths
| Aspect | Assessment |
|---|---|
| Backward Compatibility | Excellent. The rotate config now accepts bool | dict[str, bool] — existing boolean configs continue to work unchanged. |
| Code Organization | Clean property abstraction in tensor_quantizer.py — rotate_is_enabled and rotate_is_fp32 handle the type polymorphism well. |
| Test Coverage | test_kv_rotate is now parametrized for both rotate=True and rotate={"enable": True, "rotate_fp32": True}. Also added FP32 test in test_hadamard_transform. |
| Documentation | Changelog updated, config docstrings are clear, and the PR description includes usage examples. |
| Typo Fix | Caught and fixed "ratation" → "rotation" in the config description. |
⚠️ Points to Consider
1. Numerical Precision in Mixed Scenarios
In functional.py, the sqrt calculation uses torch.float32 tensor, but if rotate_fp32=False and input is FP16/BF16, the division happens in the input's dtype. This is existing behavior, but worth noting that the denominator's precision differs based on mode.
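To illustrate the dtype behavior being described, here is a minimal sketch (function name and structure are hypothetical, not the actual `functional.py` code): the sqrt denominator starts in float32 either way, but the division runs in float32 only when `rotate_fp32` is set.

```python
import torch


def hadamard_normalize(x: torch.Tensor, rotate_fp32: bool) -> torch.Tensor:
    """Illustrative sketch of the 1/sqrt(n) normalization after a Hadamard
    transform, showing how the division dtype depends on the mode."""
    n = x.shape[-1]
    denom = torch.sqrt(torch.tensor(n, dtype=torch.float32))
    if rotate_fp32:
        # Compute (and divide) in float32, then cast back to the input dtype.
        return (x.float() / denom).to(x.dtype)
    # Divide in the input's dtype (fp16/bf16), matching the existing behavior.
    return x / denom.to(x.dtype)
```

Both branches return the input dtype; the difference is only the precision of the intermediate arithmetic.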
2. Property Logic in rotate_is_fp32
```python
return (
    self._rotate.get("rotate_fp32", False)
    if isinstance(self._rotate, dict) and self.rotate_is_enabled
    else False
)
```

The check `self.rotate_is_enabled` inside `rotate_is_fp32` is redundant since you already check `isinstance(self._rotate, dict)`. If it's a dict with `"enable": False`, returning False either way is fine, but consider simplifying to just check `isinstance`.
3. Test Assertions
Both FP32 and non-FP32 tests use the same atol=0.05. This is reasonable given the comment about 16-bit float errors, but consider if tighter tolerance is expected for FP32 mode.
📝 Minor Nits
- The `CHANGELOG.rst` entry is brief. Consider slightly more detail: "Add support for FP32 rotation computation in RHT quantization (QuaRot/SpinQuant) when specified in rotation config."
Verdict
LGTM — This is a solid, well-tested feature addition with proper backward compatibility. The code is clean and the documentation is thorough. Ready to merge after addressing any optional nits.
modelopt-bot
left a comment
See detailed feedback in inline comments below.
Signed-off-by: Kinjal Patel <kinjalpravin@nvidia.com>
Force-pushed 9920a1e to 126570f
What does this PR do?
Type of change: New Feature
Overview:
This PR adds support for performing the RHT rotation in float32 when enabled by the quantization configuration. It also makes the `rotate` argument in the quantization configuration accept either a `bool` (for backward compatibility) or a `dict` (which adds the float32 rotation option).
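For illustration, the two accepted shapes of the `rotate` field might look like the following (the dicts below are simplified fragments, not the full quantization config schema):

```python
# Backward-compatible boolean form: rotation on, default precision.
quant_cfg_bool = {"rotate": True}

# New dict form: enable rotation and run the Hadamard transform in float32.
quant_cfg_fp32 = {"rotate": {"enable": True, "rotate_fp32": True}}

# Dict form with fp32 off behaves like the boolean form.
quant_cfg_fp16 = {"rotate": {"enable": True, "rotate_fp32": False}}
```

Existing configs that pass a plain boolean need no changes.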
Usage

- Updated `NVFP4_KV_ROTATE_CFG` locally with `"rotate": {"enable": True, "rotate_fp32": True}`
- Updated `NVFP4_KV_ROTATE_CFG` locally with `"rotate": {"enable": True, "rotate_fp32": False}`

Testing

Updated unit test in `tests/gpu/torch/quantization/test_hadamard.py`

Before your PR is "Ready for review"
Additional Information
Tests